Functional Dimension Reduction for Chemometrics

نویسندگان

  • Tuomas Kärnä
  • Amaury Lendasse
چکیده

High dimensional data are becoming more and more common in data analysis. This is especially true in fields that are related to spectrometric data, such as chemometrics. Due to development of more accurate spectrometers one can obtain spectra of thousands of data points. Such a high dimensional data are problematic in machine learning due to increased computational time and the curse of dimensionality (Haykin, 1999; Verleysen & François, 2005; Bengio, Delalleau, & Le Roux, 2006). It is therefore advisable to reduce the dimensionality of the data. In the case of chemometrics, the spectra are usually rather smooth and low on noise, so function fitting is a convenient tool for dimensionality reduction. The fitting is obtained by fixing a set of basis functions and computing the fitting weights according to the least squares error criterion. This article describes a unsupervised method for finding a good function basis that is specifically built to suit the data set at hand. The basis consists of a set of Gaussian functions that are optimized for an accurate fitting. The obtained weights are further scaled using a Delta Test (DT) to improve the prediction performance. Least Squares Support Vector Machine (LS-SVM) model is used for estimation.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Review of robust multivariate statistical methods in high dimension.

General ideas of robust statistics, and specifically robust statistical methods for calibration and dimension reduction are discussed. The emphasis is on analyzing high-dimensional data. The discussed methods are applied using the packages chemometrics and rrcov of the statistical software environment R. It is demonstrated how the functions can be applied to real high-dimensional data from chem...

متن کامل

Regularized Tensor Factorizations and Higher-Order Principal Components Analysis

High-dimensional tensors or multi-way data are becoming prevalent in areas such as biomedical imaging, chemometrics, networking and bibliometrics. Traditional approaches to finding lower dimensional representations of tensor data include flattening the data and applying matrix factorizations such as principal components analysis (PCA) or employing tensor decompositions such as the CANDECOMP / P...

متن کامل

Analysis of Censored Survival Data with Dimension Reduction Methods‎: Tehran Lipid and Glucose Study

 ‎Cardiovascular diseases (CVDs) are the leading cause of death worldwide‎. ‎To specify an appropriate model to determine the risk of CVD and predict survival rate‎, ‎users are required to specify a functional form which relates the outcome variables to the input ones‎. ‎In this paper‎, ‎we proposed a dimension reduction method using a general model‎, ‎which includes many widely used survival m...

متن کامل

Bayesian Sparse Tucker Models for Dimension Reduction and Tensor Completion

Tucker decomposition is the cornerstone of modern machine learning on tensorial data analysis, which have attracted considerable attention for multiway feature extraction, compressive sensing, and tensor completion. The most challenging problem is related to determination of model complexity (i.e., multilinear rank), especially when noise and missing data are present. In addition, existing meth...

متن کامل

Sparse Higher-Order Principal Components Analysis

Traditional tensor decompositions such as the CANDECOMP / PARAFAC (CP) and Tucker decompositions yield higher-order principal components that have been used to understand tensor data in areas such as neuroimaging, microscopy, chemometrics, and remote sensing. Sparsity in high-dimensional matrix factorizations and principal components has been well-studied exhibiting many benefits; less attentio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009